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1.0   Introduction  MONTEREY  CA  93943-5101 

The  objective  analysis  of  meteorological  data  is  increasing- 
ly being  performed  by  methods  based  on  "Optimum  Interpolation" , 
or  01.   01  is  a  statistical  scheme  with  its  roots  in  the  ideas  of 
Weiner  and  Kolmogorov  and  was  introduced  to  the  meteorological 
literature  by  Gandin  (1963).   At  about  the  same  time  the  ideas 
were  also  being  developed  independently  in  other  scientific 
fields,  including  geophysics,  mining,  and  electrical  engineering. 
The  theoretical  basis  of  01  requires  its  application  to  a  sta- 
tionary random  field  with  known  statistical  properties  (for  our 
purposes  this  is  for  an  ensemble  average  taken  over  time),  in 
particular  the  spatial  mean  and  covariance  function  are  assumed 
to  be  known.   In  addition,  errors  in  the  measurements  of  the 
field  (observations)  are  assumed  to  be  made  with  known  standard 
deviation.   In  meteorology  it  is  usually  assumed  that  the  field 
has  zero  mean,  although  this  is  not  strictly  required,  and  as 
part  of  the  process  the  mean  can  be  estimated  along  with  the 
field  values. 

In  meteorology,  the  process  generally  proceeds  along  the 
following  lines  for  the  univariate  case.   The  field  to  be  approx- 
imated is  the  deviation  of  a  first-guess  field  (e.g.,   pressure 
heights)  from  the  true  value  of  the  field.   The  first-guess  field 
can  be  obtained  from  climatology,  but  in  current  practice  is 
usually  the  predicted  value  from  a  numerical  weather  prediction 
model.   The  first-guess  errors  are  obtained  by  subtracting  the 
first-guess  from  the  observed  values.   First-guess  values  are 
typically  known  on  a  regularly  spaced  grid,  and  these  values  are 


interpolated  to  the  irregularly  spaced  observation  points.   The 
interpolation  of  first-guess  errors  from  the  observation  points 
to  the  grid  points  is  where  the  application  of  01  is  carried  out. 
Current  applications  are  usually  in  a  multivariate  setting, 
taking  into  account  the  physical  relationship  between  meteorolog- 
ical variables  such  as  winds  and  pressure  heights.   This  requires 
that  cross  covariance  functions  be  known  or  that  functional 
relationships  between  the  variables  be  assumed. 

The  practical  application  of  01  requires  a  great  many 
compromises,  so  the  actual  process  is  usually  referred  to  as 
statistical  interpolation  (SI).   Some  papers  dealing  with  the 
practical  problems  are  those  of  Rutherford  (1972),  Bergman 
(1979),  and  Lorenc  (1981).   The  first  compromise  to  be  faced  is 
that  of  estimating  the  spatial  covariance  function  for  the  first- 
guess  errors.   In  practice  the  covariance  function  is  estimated 
from  historical  data  (an  iterative  process,  since  improved  esti- 
mates will  change  the  first-guess  errors)  and  is  sometimes 
assumed  to  be  isotropic  although,  as  will  be  noted  later,  some 
gains  are  apparently  obtained  by  the  assumption  of  anisotropy. 
The  assumption  of  stationarity  is  certainly  not  satisfied  and  as 
a  practical  matter  is  not  necessarily  followed.   Since  the  method 
requires  the  solution  of  a  system  of  linear  equations  in  as  many 
unknowns  as  there  are  observations,  it  is  necessary  to  select  a 
subset  of  the  observations  to  be  used  for  obtaining  the  first- 
guess  error  at  a  particular  grid  point  (or  set  of  grid  points). 


The  most,  important  problem  in  the  entire  process  is  probably 
the  estimation  of  the  spatial  covariance  function  for  the  first- 
guess  error  field.   This  problem  has  been  addressed  by  numerous 
authors,  and  has  resulted  in  a  great  many  proposed  functional 
representations.   Some  of  these  are  listed  in  Appendix  A,  using  a 
standardized  notation,  with  references  to  some  of  their  appear- 
ances in  the  literature.   In  the  past,  one  commonly  used  covari- 
ance function  is  the  Gaussian,  or  negative  squared  exponential. 
While  this  function  has  been  used  in  practice  (see  Bergman  (1979) 
and  Lorenc  (1981)),  it  has  also  been  known  for  some  time  that  the 
function  has  inadequate  spectral  characteristics  (see  Julian  and 
Thiebaux  (1975)).   Another  problem  is  its  limited  number  of 
parameters,  making  it  difficult  to  accurately  fit  the  forecast 
error  statistics. 

While  it  has  been  recognized  for  many  years  that  the  covar- 
iance function  must  be  a  positive  definite  function,  this  has  not 
always  been  adhered  to  in  practice.   One  of  the  advantages  of  01 
is  that  an  estimate  of  the  mean-squared  error  is  easily  obtain- 
able, and  much  is  made  of  this  fact.   The  error  estimate  is  used 
in  quality  control  as  well  as  to  assess  the  effect  of  loss  of 
observations  on  the  accuracy  of  objective  analyses  (e.g.  see 
Thiebaux,  1980).   The  error  estimates  are  much  more  sensitive  to 
mis-specification  of  parameters  than  the  actual  analyses  (see 
Franke ,  1985),  and  the  use  of  nonpositive  definite  "covariance'* 
functions  will  undoubtedly  cause  additional  error.   Recent  empha- 
sis on  the  necessity  of  positive  definite  functions  has  begun  to 


make  an  impact  and  conditions  to  guarantee  the  property  can  be 
found  in  Section  2  of  this  report. 

One  class  of  positive  definite  functions  proposed  in  the 
last  ten  to  fifteen  years  is  the  class  from  the  covariance  func- 
tions of  autoregressive  models,  mainly  those  of  order  two  and 
three  (see  Thiebaux,  1976,  1985).   These  covariance  functions 
arise  from   random  processes  which  are  governed  by  certain  dif- 
ference equations  with  random  forcing.   The  underlying  process 
being  modeled  may  or  may  not  be  representative  of  weather  proces- 
ses, but  the  resulting  spatial  covariance  functions  have  enough 
parameters  to  model  the  raw  statistics  to  a  reasonable  degree. 
The  use  of  these  models  in  anisotropic  (product  type)  and  isotro- 
pic models  has  been  investigated  by  Thiebaux  (1976,  1985).   The 
use  of  the  one  dimensional  functions  as  isotropic  multidimen- 
sional covariance  functions  can  violate  the  positive  definite 
condition,  however,  as  is  shown  in  Section  3. 

Another  class  of  covariance  functions,  in  use  at  The 
European  Center  for  Medium  Range  Weather  Forecasting  (originally 
proposed  by  Rutherford  (1972),  and  later  discussed  by  Lonnberg 
(1982)  and  others)  is  the  Bessel  function  model.   As  will  be 
discussed  later,  a  convex  combination  of  Bessel  functions  J^(k.s) 
and  a  constant  will  automatically  be  positive  definite  and  will 
allow  sufficient  parameters  to  model  the  forecast  error 
statistics  very  well.   Compared  to  the  autoregressive  models, 
these  functions  require  considerably  more  computation  for  their 
evaluation. 


At  the  present  time  there  is  a  great  deal  known  about  what 
kinds  of  functions  are  suitable  for  use  as  multivariate  covar- 
iance  functions,  necessary  conditions  for  positive  def initeness , 
and  families  of  such  functions  which  embody  enough  parameters  to 
allow  fitting  the  forecast  error  statistics  sufficiently  well 
without  resorting  to  functions  which  are  unwieldy  for  computing. 
One  of  the  purposes  of  this  report  is  to  attempt  to  bring  toget- 
her in  one  place  the  information  in  a  form  accessible  to  practi- 
tioners in  the  field. 

The  class  of  positive  definite  functions,  characterizations 
of  them,  and  construction  of  such  functions  are  discussed  in 
Section  2.   In  Section  3  the  general  multivariate  case  are  ad- 
dressed, including  conditions  for  positive  definiteness  as  well 
as  smoothness  conditions  required  when  the  meteorological  vari- 
ables are  related  through  differentiation.   Use  of  isotropic 
covariance  functions  for  the  multidimensional  case,  conditions 
for  single  dimensional  positive  definite  functions  to  be  isotro- 
pic multidimensional  positive  definite  functions,  examples  and 
counterexamples  are  also  discussed.   In  section  4  construction  of 
possible  candidates  for  use  as  covariance  functions  is  discussed 
and  the  results  of  some  estimated  rms  error  calculations  under 
various  conditions  is  given.   A  listing  of  functions  which  have 
been  proposed  in  the  literature  as  covariance  functions  is  given 
in  Appendix  A,  and  some  simulations  conducted  on  estimation  of 
the  covariance  function  from  the  raw  forecast  error  statistics  is 
given  in  Appendix  B. 


2.0   Positive  Definite  Functions 

Covariance  functions  are  positive  semi-definite,  and  thus 
their  theory  is  closely  allied  with  that  of  positive  definite 
functions.   The  purpose  of  this  section  is  to  summarize  some  of 
what  is  known  about  positive  definite  functions,  and  point  out 
properties  that  are  of  use  in  the  present  context.   The  discus- 
sion will  be  confined  to  real  functions.   An  excellent  reference 
for  the  topic  is  Stewart  (1976). 

A  function  C(s)  is  said  to  be  positive  definite  if  for  any 
points  s^,  S2>  •••  ,  sn ,  and  any  values  t^ ,  t2 ,  ...  , t  ,  then 


■j'-^j  i0  • 


n 

1  c<»i-»j>*i 

i,  j  =  l 
This  condition  is  actually  only  positive  semi-definite,  but  to 

distinguish  between  the  two  requires  excluding  certain  sets  {s.} 
and  {t.},  and  the  possibility  of  equality  in  the  expression.   The 
additional  complication  does  not  add  to  the  theory  for  the  pur- 
poses here.   An  equivalent  condition  for  continuous  C(s)  is  that 
for  all  integrable  g(s)  with  finite  support, 


ff 

—  OO    —  CO 


C(s-t)  g(s)  g(t)  ds  dt  >  0 


The  class  of  positive  definite  (PD)  functions  have  some 
useful  properties  which  will  be  given  and  discussed.   In  what 
follows,  C(s)  will  denote  a  PD  function. 

(1)   C(s)  is  an  even  function,  and  this  in  turn  implies  that 
the  Fourier  transform  is  the  Fourier  Cosine  transform.   This 
permits  the  simplification  of  characterizing  PD  functions  in 
terms  of  their  Fourier  transforms.   Additionally,  it  will  be 

6 


useful  to  assume  that  the  argument  is  always  nonnegative  to  avoid 
having  to  indicate  absolute  values. 

(2)  The  magnitude  of  C(s)  is  bounded  by  its  value  at  the 
origin.   This  property  is  primarily  useful  to  detect  that  a 
function  is  not  PD. 

(3)  A  linear  combination  (with  positive  coefficients)  of  PD 
functions  is  PD.   This  makes  it  easy  to  construct  PD  functions 
with  parameters,  although  when  fitting  these  functions  to  data 
one  must  look  at  the  results  to  see  whether  the  coefficients 
obtained  are  positive. 

(4)  A  product  of  PD  functions  is  PD.   This  property  is  also 
an  aid  in  constructing  PD  functions  with  parameters  using  simple 
PD  functions  as  building  blocks. 

Positive  definite  functions  are  characterized  by  the  fol- 
lowing property 

Theorem:   C(s)  is  PD  if  and  only  if  the  Fourier  transform  of  C  is 
nonnegative . 

Thus,  PD  functions  are  the  (inverse)  Fourier  transforms  of  proba- 
bility density  functions  (Bochner's  Theorem,  see  Priestly  (v.l, 
1981)),  and  since  the  Fourier  transform  is  the  Fourier  Cosine 
transform,  the  inverse  transform  and  the  transform  differ  only  by 
a  constant.   The  Fourier  transform  of  a  probability  density 
function  is  called  a  characteristic  function. 

There  are  several  other  properties  that  are  sufficient  con- 
ditions for  a  function  to  be  PD. 


(1)  Laplace  transforms  of  probability  density  functions  are 
PD.   This  follows  easily  from  the  characterization  and  an  inter- 
change of  the  order  of  integration  in  the  Laplace  and  Fourier 
transforms . 

(2)  Convolutions  of  functions  with  themselves  are  PD,  i.e. 

/*  oo 

C(s)  =   lg(t)  g(t+s)  dt   is  PD. 

(3)  If  C(s)  is  convex  on  (0,  °°  )  and   lim  C(s)  =  0,  then 

X>oo 

C(s)  is  PD.   Unfortunately,  this  nice  condition  is  not  useful, 
since  it  will  be  seen  in  the  next  section  that  covariance  func- 
tions useful  for  multivariate  statistical  interpolation  can  not 
be  convex . 

For  illustrative  purposes,  and  to  point  out  some  examples 
that  will  be  useful  later  on,  we  note  that  the  following  func- 

— CS       -pq2  -pc 

tions  are  positive  definite:   e    ,  e     ,  (1  +  cs)e  "  ,  cos(as), 
1,  and  JQ(as).   The  following  are  not  positive  definite,  although 
some  of  them  do  not  seem  to  be  significantly  different  than  the 
former:   ( s4  +  l)"1,  e~as3 ,  and  f(s)  =  1  for  0<s< 1 ,  f(s)  =  0  for 
s>l. 

The  above  examples  can  easily  be  seen  to  be  PD  or  not  by 
inspection  of  the  Fourier  transforms.   Because  the  constant  func- 
tion is  PD,  its  use  as  a  (positive)  additive  term  in  modeling 
covariance  functions  can  be  useful  over  a  given  spatial  range. 
It  must,  however,  not  be  truncated  at  some  fixed  distance  (less 
than  that  which  might  occur  in  practice)  since  the  resultant 
function  is  not  PD.   Additionally,  the  above  functions  are 


considered  only  as  functions  of  one  spatial  variable.   As  func- 
tions of  several  spatial  variables,  the  same  characterizations 
hold,  although  functions  which  have  a  positive  Fourier  transform 
in  one  dimension  may  not  have  positive  transforms  in  more  than 
one  dimension,  when  considered  as  isotropic  functions,  i.e., 
functions  of  "distance" .   Since  questions  of  this  sort  are  close- 
ly tied  into  the  problem  at  hand,  their  discussion  is  deferred  to 
Section  3. 


3.0  Multivariate  Covariance  Functions 

3 . 1  General  Development 

The  development  of  the  equations  for  multivariate  applica- 
tion of  01  to  related  variables  is  discussed  in  several  papers, 
including  Schlatter,  et .  al .  (1976),  Bergman  (1979),  Lorenc 
(1981),  Thiebaux  (1985),  and  Pedder  and  Thiebaux  (1985).   At 
first  glance  the  results  do  not  always  seem  to  be  the  same, 
however  this  is  primarily  due  to  different  notation  being  used  in 
such  a  way  as  to  be  potentially  confusing. 

For  completeness,  the  derivation  of  the  relationship  between 
the  covariance  functions  and  cross-covariance  functions  for  var- 
iables related  through  differentiation  will  be  derived  here,  and 
the  differences  with  other  notations  in  use  noted.   Suppose  that 
it  is  wished  to  analyze  three  related  dependent  variables,  re- 
quiring that  the  corrections  obtained  via  01  (or  more  correctly, 
SI)  will  not  upset  the  relationship  between  the  predicted  values 
of  the  variables.   Let  the  error  in  the  predicted  variables  be 
denoted  by  Z(x,y),  X(x,y),  and  Y(x,y),  where  (x,y)  gives  the 
spatial  location  and  it  is  assumed  that  X(x,y)  =  k..Z  (x,y)  and 

Y(x,y)  -   k9Z  (x,y).   The  subscript  on  Z  denotes  partial  differen- 
^  y 

tiation.   Assume  that  the  errors  in  the  predicted  values  are 
stationary  (that  is,  the  statistics  do  not  depend  on  (x,y)),  with 
zero  mean.   Using  E[ . ]  to  denote  the  expected  value,  or  ensemble 
average,  the  spatial  covariance  function  for  Z,  as  a  function  of 
"lags"  s  and  t,  is 

R(s,t)  =  E[Z(x,y)Z(x+s,y+t)]  =  E[Z( x-s , y-t ) Z(x, y ) ]  . 
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The  latter  equality  follows  from  stationarity .   Under  the  assump- 
tion that  the  order  of  partial  differentiation  and  the  expected 
value  can  be  interchanged  the  cross  covariance  functions,  and  the 
covariance  functions  for  the  derived  variables  are 
E[Z(x,y)X(x+s,y+t)]  =  E[Z(x, y )kxZx (x+s , y+t ) ]  = 
E[Z(x,y)k1Zs(x  +  s,y  +  t)]  =  kxE[Z(x  ,  y  )  Z  (  x+s  ,  y+t )  ]s  =  k^Cs.t), 

E[Z(x,y)Y(x+s,y+t)]  =  E[Z( x, y )k2Zy ( x+s , y+t ) ]  = 
E[Z(x,y)k2Zt(x+s,y+t)]  =  k2E[Z( x , y ) Z ( x+s , y+t ) ]t  =  k2Rt(s,t), 

E[X(x,y)Z(x+s,y+t)]  =  E[X( x-s , y-t ) Z( x , y ) ]  = 
E[k1Zx(x-s,y-t)Z(x,y)]  =  E[  -kj_  Zg  (  x-s  ,  y-t )  Z(x ,  y )  ]  = 
-k1E[Z(x,y)Z(x+s,y+t)]s  =  -k^Cs.t), 

E[X(x,y)X(x+s,y+t)]  =  E[X( x, y Jk^C x+s , y+t ) ]  = 
k1E[X(x,y)Z(x+s,y+t)]s  =  k1E[X(x-s,y-t)Z(x,y)]a  = 
k1E[k1Zx(x-s,y-t)Z(x,y)]s  =  -k^E[Zs(x-s , y-t )Z(x , y ) ]s  = 
-k^E[Z(x,y)Z(x+s,y+t)]ss  =  -k^R^Cs.t), 

E[X(x,y)Y(x+s,y+t)]  =  E[X( x, y )k2Zy ( x+s , y+t ) ]  = 
E[X(x,y)k2Zt(x+s,y+t)]  =  k2E[X(x , y ) Z ( x+s , y+t ) ]t  = 
k2E[k1Zx(x-s,y-t)Z(x,y)]t  =  kgEC-kj Zg (x-s , y-t )Z(x , y ) ]t  = 
-k1k2E[Z(x,y)Z(x+s,y+t)]  =  -k^R^  (  s  ,  t ) 

E[Y(x,y)Z(x+s,y+t)]  =  E[k2Z  (x-s,y-t)Z(x,y)]  = 
E[-k2Zt(x-s,y-t)Z(x,y)]  =  -k2E[Z(x,y )Z(x+s,y+t) ]t  = 

2^t^s'^  ' 
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E[Y(x,y)X(x+s,y+t)]  =  E[Y(x, y )k1Zx(x+s , y+t ) ]  = 
E[Y(x,y)k1Zs(x  +  s,y  +  s)]  =  k^CYU-s  ,  y-t )  Z  (  x,  y  )  ]s  = 
k1E[k2Zy(x-s,y-t)Z(x,y)]s  =  k1E[-k2Zt(x-s,y-t)Z(x,y)]?  = 
-k1k2E[Z(x,y)Z(x+s,y+t)]st  =  -k^R^  (  s  ,  t )  , 

E[Y(x,y)Y(x+s,y+t)]  =  E[Y(x, y )k2Zy (x+s , y+t ) ]  = 
k2E[Y(x,y)Z(x+s,y+t)]t  =  k2E[Y(x-s , y-t ) Z( x , y ) ]t  = 
k2E[k2Zy(x-s,y-t)Z(x,y)]t  =  -k|E[Zt(x-s,y-t )Z(x, y ) ]  = 
-k|E[Z(x,y)Z(x+s,y+t)]tt  =  -k^Rtt(s,t)  . 

Note  that  while  the  covariance  functions  are  symmetric,  the 
cross  covariance  functions  are  antisymmetric,  which  accounts  for 
the  sign  change  that  comes  from  changing  the  order  of  the  product 
in  the  expected  value.   This  means,  among  other  things,  that  the 
cross  covariance  must  be  zero  at  zero  lag  values.   This  behavior 
can  be  seen  in  the  plots  of  the  various  functions  in  Bergman 
(1979)  and  Schlatter,  et.  al .  (1976).   In  those  papers,  the  signs 
which  appear  above  also  appear,  while  in  Thiebaux  (1985)  and 
Pedder  and  Thiebaux  (1986),  they  are  absent.   This  can  be  ex- 
plained from  the  order  in  which  processes  are  carried  out.   In 
the  above,  if  the  observation  points  are  (x. ,y. ),  then  the  lags 
for  a  particular  pair  (with  subscripts  i  and  j,  say)  are  (s,t)  = 
(x.-x. ,y.-y.   ),  and  the  covariance  function  for  Z  can  be  con- 

\)  -*-  %)  •*- 

sidered    as 


R(xi'yi' W  =  R<xrxi'yj-yi>  = 

E[Z(x,y)Z(x+xj-xi,y+yj-yi)    =    E[Z(x+xi , y+yi ) Z (x+Xj , y+y. ) ] 
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Now,  if  we  consider  the  cross  covariance,  and  think  of  the  points 
(x.,y.)  and  (x.,y.)  as  being  variables,  rather  than  fixed  points, 
then 

E[X(x+xi,y+yi)Y(x+xj,y+yj)]  =  [kx Zx ( x+x± , y+Yi ) Y( x  +  x. , y+y  . ) ]  : 

k1E[Zxi  (x+x.,t+yi)k2Zy(x+xj,y+yj)]  = 

k1k2E[Z(x+xi,y+yi)Z(x+xj,yj)]xiyj     =    Rxi yj ( Xi , y ± , x j , y . )     . 

The  other  covariance  and  cross  covariance  functions  are  handled 
in  a  similar  manner.   Differentiation  with  respect  to  x.  then  is 
the  negative  of  differentiation  with  respect  to  s,  while  differ- 
entiation with  respect  to  x.  is  the  same  as  differentiation  with 
respect  to  s  and  therefore  the  apparent  loss  of  the  signs. 

3 . 2  Some  Necessary  Properties 

In  order  for  the  covariances  of  the  derived  functions  and 
the  cross  covariance  functions  to  exist,  certain  conditions  must 
be  satisfied  by  the  function  R(s,t).   These  have  been  alluded  to 
by  Buell  (1972),  and  are  given  by  Julian  and  Thiebaux  (1975). 
The  conditions  given  are  for  an  isotropic  covariance  function. 

For  the  moment,  suppose  that  s  represents  the  lag  distance  (in 

2     2  1/2 
the  above ,( s   +  t  )    ),  and  that  the  covariance  function  R(s)  is 

isotropic.   Then  the  conditions  cited  are 

Rs  (  s  )  R  (  s  ) 

lim  is  finite,  and   lim[  -  R   (s)  1=0 

s>0    S  s>0     S       SS 

When  one  considers  that  R  (0)  must  be  zero,  the  first  limit  is 

the  definition  of  the  second  derivative  at  s=0,  hence  existence 

of  the  limit  means  that  the  covariance  function  must  be  twice 

dif f erentiable  at  s=0.   The  second  limit  then  says  that  the 
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second  derivative  is  continuous  at  s=0.   Thus  the  theorem  given 
by  Julian  and  Thiebaux  can  be  replaced  with  a  simpler  one: 

Theorem  1:   If  R(s)  is  an  isotropic  covariance  function  for  Z  in 
two  dimensions,  then  the  covariance  functions  for  the  partial 
derivatives  of  Z  exist  at  s=0  if  and  only  if  R(s)  is  twice 
continuously  dif f erentiable  at  s=0 . 

3.3  Anisotropic  Functions 

It  has  been  contended  that  isotropic  covariance  functions  do 
not  adequately  model  the  forecast  error  statistics  and  that  gains 
can  be  made  by  using  anisotropic  functions.   See  Thiebaux  (1976, 
1977,  1985),  and  Thiebaux,  et .  al .  (1985)  for  development  and 
discussion  of  product  forms  of  covariance  functions.   Use  of 
products  of  single  dimensional  functions  has  the  advantage  of 
carrying  over  desirable  properties  to  higher  dimensions,  as  well 
as  being  able  to  use  essentially  one  dimensional  structures  and 
techniques.   On  the  other  hand,  perusal  of  contour  plots  of 
product  functions  show  that  zero  crossings  of  the  functions  occur 
along  grid  lines,  and  it  is  easy  to  see  this  will  always  happen. 
This  may  be  undesirable  behavior,  and  almost  certainly  it  is  not 
the  kind  of  behavior  seen  in  the  error  statistics. 

Another  form  of  anisotropy  is  possible,  one  which  results 
from  scaling  differently  in  two  orthogonal  directions,  then  using 
an  isotropic  function  in  the  scaled  variables.   Presumably,  but 
not  necessarily,  the  two  directions  would  be  the  longitude  and 
latitude  directions.   This  would  result  in  the  zero  crossings  in 
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"the  contour  plots  of  "the  function  being  ellipses  with  axes  in  the 
two  directions;  all  contours  would  have  the  same  shape.   The 
eccentricity  of  the  ellipse  is  a  measure  of  the  anisotropy  of  the 
error  statistics.   It  would  be  easy  to  allow  rotation  along  with 
the  scaling  to  obtain  ellipses  of  constant  "distance"  with  any 
axis  orientation.   No  references  to  use  of  this  kind  of  anisot- 
ropy in  the  meteorological  literature  have  come  to  my  attention. 
The  properties  of  any  such  functions  are  those  of  isotropic 
functions,  of  course,  since  the  anisotropy  arises  purely  from  a 
rotation  and  scaling. 

3.4   Isotropic  Functions 

The  use  of  isotropic  functions  in  two  or  more  dimensions 
which  have  been  derived  from  one  dimensional  considerations  can 
possibly  lead  to  nonpositive  definite  functions.   For  example, 
Ripley  (1981,  p.  11)  quotes  a  result  of  Matern  (1960),  which 
gives  a  lower  bound  for  isotropic  positive  definite  functions  in 
several  dimensions.   The  result  means  that  positive  definite 
functions  in  two  dimensions  are  necessarily  bounded  below  by 
-0.403  (the  minimum  value  of  J~  ( s )  ),  while  in  three  dimensions 
the  bound  is  -0.218     Thus  any  oscillatory  positive  definite 
function  in  one  dimension  which  takes  on  values  less  than  -0.403 
cannot  be  an  isotropic  positive  definite  function  in  two 
dimensions.   A  positive  definite  function  which  has  parameters  to 
separately  control  the  oscillation  frequency  and  the  decay  can 
probably  be  made  into  a  nonpositive  definite  isotropic  function 
in  two  dimensions.   In  particular,  a  function  such  as 
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f(s)  =  cos(as)  exp(-bs) 
can  be  made  nonpositive  definite  by  suitable  choice  of  parame- 
ters, say  a  -    5  and  b  =  .1  .   This  result  also  applies  to  the 
isotropic  version  of  the  second  order  auto-regressive  (SOAR) 
function  as  well,  although  a  restriction  on  the  ratio  of  the 
parameters  in  the  function  will  guarantee  it  is  positive  definite 
in  two  dimensions.   In  practice  it  has  been  shown  that  one  of  the 
parameters  tends  to  be  zero  in  the  SOAR  when  fit  to  meteorologi- 
cal data  (Thiebaux,  et.  al .  (1985),  and  Section  4  of  this  re- 
port).  This  satisfies  the  necessary  requirement,  and  hence  the 
function  as  used  is  positive  definite.   The  details  will  be  given 
later. 

There  is  a  one-to-one  correspondence  between  covariance 
functions  in  one  dimension  and  isotropic  covariance  functions  in 
two  dimensions.   Matheron  (1973)  gives  a  way  of  generating  an 
isotropic  d-dimensional  covariance  function  from  a  one  dimen- 
sional covariance  function,  the  so-called  "turning  band"  method. 
The  relation  is 


d 


K[C1 

Jo 


C,(s)  =  K  I  C,  (vs)(l-v2)(d  3)/2  dv  , 


where  K  is  a  constant  that  is  unimportant  for  our  purposes.   In 
two  dimensions,  this  gives 


K  I  Cx(v 

^0 


C9(s)  =  K  I  C,  (vs)(l-v2)  1/2 


dv 


Although  Matheron  doesn't  give  the  result,  it  is  possible  to 
invert  this  relation  to  show  the  one-to-one  relationship.   A 
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sketch  of  the  inversion  process  follows.   A  change  of  variables 
in  the  previous  expression  gives 

C2(s)  =  K  /  C1(t)(s2-t2)~1/2  dt  . 

2  2 

A  further  change  of  variables,  s   =  x ,  and  t  -    y,    yields 

C2(x1/2)  =  K  /  [C1(y1/2)(2y)"1/2](y-x)"1/2  dt  . 
■'0 
This  is  Abel's  equation  for  K  C-  (y 1/2 ) ( 2y ) ~1/2 ,  and  the  solution 

is  well  known  (see  Hochstadt,  1972)  and  is  given  by 


/Z)(X-y)-^Z   dy  , 


where  K'  is  a  different  constant,  and  upon  substitution  for  s  and 
t  once  again,  gives 

C^s)  =  K's  ^j-^-y  /  C2(t)(s2-t2)"1/22t  dt  . 

The  correspondence  between  covariances  in  one  dimension  and 
three  dimensions  is  easier  to  invert,  and  is  given  by  Ripley 
(1981).   There  the  relation  is 


A 


C3(s)  =/C1(vs)  dv,  and   C^s)  =  ^  [sC3(s)] 


While  this  characterization  of  multidimensional  isotropic 
covariance  functions  is  interesting,  and  can  in  fact  be  used  to 
generate  isotropic  multidimensional  covariance  functions,  it  does 
not  easily  answer  the  question  as  to  whether  or  not  a  particular 
one  dimensional  function  is  an  isotropic  positive  definite  func- 
tion in  more  dimensions.   One  way  to  answer  such  a  question  is  to 
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use  the  characterization  of  positive  definite  functions  as 
Fourier  transforms  of  probability  density  functions  (or  alterna- 
tively, as  functions  whose  Fourier  transform  is  positive).   The 

Fourier  transform  of  an  isotropic  function  C(s)  in  two  dimensions 

1  /2 
becomes  (essentially)  the  Hankel  transform  of  s    C(s).   It  may 

be  considerably  easier  to  look  at  the  one  dimensional  Fourier 

transform,  however,  so  it  would  be  useful  to  have  a  sufficient 

condition  on  the  Fourier  transform  of  the  function  which  would 

guarantee  it  is  an  isotropic  positive  definite  function  in  two 

dimensions.   Such  a  condition  will  now  be  derived. 

Let  C. (s)  be  a  positive  definite  function  of  one  variable. 

By  the  characterization  in  the  previous  section,  write 

(1)   C1(s)  =   /  cos(rs)  h(r)  dr, 


-L 


for  some  probability  density  function  h(r)   (i.e.,  h(r)>0,  with 
integral  equal  1).   Then,  under  what  conditions  will  C.(s)  be  the 
two  dimensional  Fourier  transform  of  an  isotropic  probability 
density  function?   Such  a  transform  is  necessarily  isotropic.   A 
function  g(s)  is  sought  so  that 


(2)      C,  (r)  =   Jn(rs)  s  g(s)  ds 
This  expression  is  inverted  using  the  Hankel  transform,  giving 


■ft 


(3)      g(s)  =   /JQ(sr)  r  (^(r)  dr. 

'o 

Then,  using  (1)  in  (3),  and  interchanging  the  order  of  integra- 
tion, followed  by  integration  by  parts  yields 
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g(s)    =      lJ0(sr)    r    (       /cos(tr)    h(t)    dt)dr    = 


/  h(t)    (       /Jn(sr)    r   cos(tr)    dr)    dt    = 


JO 
fh(t)    (-^/J0(sr)    sin 

Jo  Jo 


in(tr)    dr)    dt    = 


p. 


Jh'(t)    (    /Jn(sr)    sin(tr)    dr    )    dt 
'o  ^0 


and    then, 


(4)  g(s)    =    -   /h'(t)/(t2-s2)1/2    dt    . 

•'s 

- 1  /2 
The  last  equality  uses  the  Hankel  transform  of  r   /  sin(tr). 

In  order  for  g(s)  to  be  a  probability  density  function  it 

must  be  nonnegative  with  integral  equal  to  one.   It  is  easy  to 

show  (again,  interchanging  the  order  of  integration)  that  the 

integral  is  equal  to  one.   Necessary  and  sufficient  conditions 

for  the  nonnegativity  of  g(s)  are  harder.   The  following  is 

apparent . 

Theorem  2-      A  sufficient  condition  for  C.  ( s )  to  be  a  valid  iso- 
tropic covariance  function  in  two  dimensions  is  that  h(t)  be  a 
monotone  decreasing  (h'(t)<0)  function. 

This  condition  seems  unnecessarily  restrictive,  and  further 
is  not  as  revealing  as  would  be  convenient  since  the  condition  is 
on  the  Fourier  cosine  transform  of  C.(s)  rather  than  C.  ( s ) 
itself.   Nonetheless,  the  condition  can  be  used  to  show  the 
following  results,  which  are  of  definite  interest. 
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(1)  Consider  the  exponentially  damped  cosine  function, 
C(s)  =  cos(as)e   s. 

The  Fourier  cosine  transform  of  this  function  is 

2      2 

Inspection  of  h'  (t)  shows  that  if  b   >_  3a  ,  it  is  nonpositive  fo] 

all  t,  and  hence  h(t)  is  monotone  decreasing  under  that 
constraint . 

(2)  Consider  the  second  order  autoregressive  (SOAR)  covar- 
iance  function, 

C(s)  =  [cos(as)  +  (b/a)sin(as)]e~bs. 
The  Fourier  cosine  transform  of  this  function  is 
h(t)  =F(C)(t)  =  Cb2+(t-2^2b3[g22|(tta)2]  , 

2     2 
Inspection  of  h' (t)  reveals  that  if  b   >  a  ,  it  is  nonpositive 

for  all  t,  and  thus  h(t)  is  monotone  decreasing  under  that 

constraint.   The  final  result  is  that  each  of  the  above  C(s)  is 

an  isotropic  positive  definite  function,  hence  is  a  covariance 

function  if  the  appropriate  inequality  on  the  parameters  is 

satisfied. 

(3)  Consider  the  special  case  of  the  damped  cosine  functioi 
C(s)  =  [A  +  (l-A)cos(as)](l  +  (bs)2)"1/2  . 

The  Fourier  transform  of  this  function  is 
h(t)  =  F(C)(t)  = 

(2b)_1{(l-A)[K0(|t-a|/b)  +  KQdt+al/b)]  +  AK^t/b)}  . 
Because  the  modified  Bessel  function  K~  becomes  unbounded  as  the 
argument  tends  to  zero,  for  A  /  1 ,  the  Fourier  transform  must  be 
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increasing  as  t  approaches  a  through  values  smaller  than  a.   For 
t  greater  than  a,  and  possibly  for  some  values  smeller  than  a, 
the  function  is  decreasing.   Thus  the  sufficient  conditions  given 
above  are  not  met,  and  it  is  easy  to  find  configurations  of  (x,y) 
points  and  parameter  values  A,  a,  and  b  for  which  the  resulting 
"covariance"  matrix  is  not  positive  definite.   The  two  dimen- 
sional Fourier  transform  of  C(s)  (the  Hankel  transform  of 

1  /2 
s    C(s)  )  has  so  far  eluded  the  author,  so  it  is  presently 

unknown  if  there  are  parameter  values  (other  than  for  A  -    1) 

which  will  yield  a  positive  definite  function. 

(4)   Consider  the  Bessel  function  J(-)(as).   The  Fourier 

transform  of  this  function  is 


h(t)=F(c,<t)=  &/»/(t«-«,i/» :  2 


a 

a  . 
This  function  is  easily  seen  to  be  monotone  decreasing  for  t>a, 


and  thus  the  Bessel  function  Jq ( as )  is  an  isotropic  covariance 
function  in  two  dimensions.   Note  however,  that  to  apply  the 
above  results  there  are  some  technical  details  that  must  be 
considered  because  of  the  infinite  jump  discontinuity  at  t=a. 

The  above  results  concerning  several  functions  proposed  for 
use  as  isotropic  covariance  functions  in  two  dimensions  is 
useful.   The  lack  of  results  and  empirical  evidence  against  the 
damped  cosine  being  positive  definite  negate  the  results  noted  in 
the  next  section  where  we  see  that  the  fitting  power  of  the 
function  is  very  good.   These  aspects  of  the  function  will  be 
discussed  further  in  the  next  section. 
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3 . 5   Summary 

The  contents  of  this  section  contain  some  useful  information 
for  the  construction  of  positive  definite  functions  and  testing 
of  functions  for  positive  def initeness .   When  possible,  the  two 
dimensional  Fourier  transform  of  C.(s)  can  be  used  to  decide 
whether  or  not  the  function  is  positive  definite.   When  the  two 
dimensional  Fourier  transform  cannot  be  obtained  in  closed  form, 
Theorem  2  can  give  some  information  if  the  one  dimensional 
Fourier  transform  is  available  in  closed  form.   While  the  condi- 
tion given  by  Theorem  2  is  only  sufficient,  and  not  necessary,  it 
has  been  shown  to  be  useful  in  investigating  some  functions  which 
have  been  proposed  for  use  as  isotropic  covariance  functions  in 
two  dimensions. 
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4.0  Some  Experiments  with  Isotropic  Covariance  Functions 

4.1  Introduction 

The  work  reported  in  this  section  was  conceived  to  help 
determine  something  about  the  overall  fitting  properties  of 
various  suggested  covariance  functions.   The  term  "overall 
fitting  properties"  is  meant  to  include  not  only  the  ability  of 
the  function  to  model  a  reasonably  complicated  true  covariance 
function,  but  also  its  performance  when  used  in  a  statistical 
interpolation  scheme  with  several  different  observation  patterns. 

The  approach  for  this  project  was  to  begin  with  published 
actual  data,  and  then  construct  a  "true",  or  model,  covariance 
function,  by  a  least  squares  fit  to  the  data  from  a  certain  class 
of  covariance  functions.   Functions  from  other  classes  would  be 
fit  to  the  same  data,  again  in  the  least  squares  sense,  and  these 
"assumed"  covariance  functions  would  then  have  their  analysis 
performance  measured  against  that  of  the  optimum  model .   I  be- 
lieve the  results  to  be  discussed  give  some  insight  into  what 
classes  of  functions  have  adequate  fitting  ability  for  modeling 
actual  forecast  error  statistics,  and  also  show  how  much  skill  is 
lost  (in  the  idealized  case)  by  use  of  inaccurate  covariance 
functions . 

The  results  given  here  consist  of  plots  of  assumed  cor- 
relation functions  together  with  the  model  (true)  correlation 
function,  plots  of  the  contours  of  expected  errors,  and  tables 
showing  expected  root-mean-square  (erms)  errors  (relative  to  the 
standard  deviation  of  the  first-guess  error)  over  three  grids  of 
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points  and  associated  observation  locations.   The  expected  errors 
were  computed  as  in  Seaman  (1983).   The  results  obtained  with 
various  assumed  correlation  functions  in  the  SI  scheme  are 
discussed  in  detail. 

4 . 2   The  Model  Correlation  Function 

The  data  for  the  covariance  function  was  obtained  (by  hand) 

from  Lonnberg  (1982).   The  data  taken  was  plotted  points  from  a 

covariance  function  of  the  type  used  by  ECMWF,  in  this  case  a 

five  term  (i.e. ,  n=5)  Bessel  series  of  the  form 
n 


(1)   ^  AiJQ(s*ki/R)  +  Aq  , 


i-1 
where  k.  is  the  i    zero  of  the  Bessel  function  J^ ( s )  ,  and  R  is 

the  radius  of  the  region  of  interest.   This  function  is  positive 

definite  as  an  isotropic  function  in  two  dimensions  provided  the 

coefficients  Ai  are  all  positive.   In  Lonnberg,  R  was  2000  km. 

In  this  work,  distance  was  measured  in  degrees,  and  the  radius 

was  scaled  to  30 

Least  squares  fits  to  the  data  by  functions  of  the  type  (1) 

for  four,  five,  and  six  terms  were  computed.   While  the  original 

paper  indicated  a  series  with  five  terms  generated  the  data,  it 

was  found  that  six  terms  yielded  all  positive  coefficients  and  a 

significant  reduction  in  the  residual  over  five  terms.   Thus,  it 

was  decided  to  adopt  the  six  term  series  as  the  model  covariance 

function.   This  six  term  series  would  also  be  marginally  harder 

to  approximate  using  other  classes  of  covariance  functions.   The 

data  and  the  fits  using  four  and  six  terms  are  shown  in  Figure  1, 
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and  the  coefficients  are  given  ,.  Table  ^  ^^  ^  ^   ^ 

The  intercept  values  of  the  approximations  were  Q  ^  ^  Q  ^ 
for  four  end  six  terms ,  respectively    Thia  __  ^^  ^ 

data  represents  the  spatial  correlation  of  the  first-guess  plu£ 

observation  error   +>,„c  +u 

error,  thus  the  intercept  is  a  function  of  the  ratio 

of  the  standard  deviations  of  first-eue„  *»a      u 

Sl  guess  and  observation  error 

The  effects  of  this  Kind  of  discrepancy  will  be  discussed  in 
section  4.4.   The  correlation  function  for  first-guess  error  is 
the  approximation  norraaliZed  to  have  value  one  at  s-0 ,  of  course. 
43  The  Grid  and  Observation  Point  Sets 

Three  grids  and  associated  point  sets  were  selected  for 
studying  the  expected  errors  of  statistical  interpolation  sohemes 
based  on  various  assumed  covariance  functions.   All  were  based  on 
the  approbate  locations  of  radiosonde  data  (from  Wahba  and 
Wendelberger  (1980)  and  Ghil,  et.al.  (1981))  within  ^    ^^ 
««d.   Each  grid  covered  a  region  which  was  30°  i„  longitude  and 
20   in  latitude,  and  the  three  were  chosen  to  represent  a  dense 
observation  set,  a  partially  dense  observation  set,  and  a  sparse 
observation  set.   The  regions  correspond  to  the  central  United 
States  with  36  observations,  the  eastern  United  States  and  west- 

em  Atlantic  Ocean  ui+v,  oc   u 

Ocean  „lth  25  observations,  and  the  middle  Atlantic 

Ocean  with  3  observations.   For  reference  purposes,  the  three 
regions  will  be  referred  to  as  the  MUS  (Mid-US,,  EC  (EaEt  Coast) 
and  MA  (Mid-Atlantic,  regions.   The. regions  and  the  observation  " 
locates  can  be  seen  in  Figures  2-U,  parts  (b),  (<j)>  ^  (d) 
respectively.   The  regions  were  gridded  at  2.5°  intervals  for 
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purposes  of  computing  expected  errors,  although  the  erms  errors 
given  in  Table  1  are  only  over  the  interior  grid  points  to  mini- 
mize edge  effects.   For  contouring  purposes  the  fields  were 
interpolated  to  finer  grids  using  bicubic  spline  interpolation. 

4 . 4   The  Assumed  Correlation  Functions  and  Results 

The  families  of  assumed  correlation  functions  fell  into  5 
classes:   (i)   Bessel  function,  (ii)   negative  squared  exponen- 
tial (sometimes  called  Gaussian),  (iii)   autoregressive  of  order 
two,  (iv)   autoregressive  of  order  three,  and  (v)   damped  cosine 
They  will  be  discussed,  along  with  the  results,  in  turn.   Plots 
of  the  assumed  correlation  functions,  along  with  the  model  corre 
lation  function,  are  shown  in  part  (a)  of  Figs.  2-11.   For  fit- 
ting purposes,  each  included  a  multiplicative  parameter  which 
determined  the  s=0  intercept,  and  was  subsequently  dropped  to 
obtain  the  correlation  function.   The  value  of  this  parameter  is 
of  interest,  however,  because  dropping  it  shifts  the  curve 
(upward)  to  pass  through  the  point  (0,1),  and  thus  different  fits 
may  be  shifted  by  different  amounts,  which  ultimately  affects  the 
fit  to  the  first-guess  error  correlation  function. 

Recall  that  the  erms  errors  given  in  Table  1  are  given  as  a 
fraction  of  standard  deviation  of  the  first-guess  error.   It  was 
assumed  that  the  ratio  of  the  standard  deviation  of  the  observa- 
tion errors  to  the  standard  deviation  of  the  first-guess  errors 
was  1/3. 
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(i)   Bessel  Function 

The  reference  expected  errors  were  computed  using  the  actual 
correlation  function  model,  given  by  Eq.  (1)  with  coefficients  as 
given  in  Table  1 .   The  results  are  given  in  Table  1 ,  and  are  the 
smallest  expected  errors  that  can  be  obtained  using  a  correction 
to  first-guess  scheme,  that  is,  they  are  truly  optimum.   The 
correlation  function  is  shown  in  part  (a)  of  Fig.  2,  while  the 
contour  plots  of  the  expected  error  for  each  of  the  three 
grid/observation  sets  is  shown  in  parts  (b),  (c),  and  (d). 

The  results  using  a  four  term  Bessel  function  are  shown  in 
Fig.  3.   Because  the  intercept  of  the  fit  to  the  data  is  0.8270 
versus  0.8592,  normalization  to  value  one  produces  a  curve  which 
is  then  predominately  above  that  for  the  model  correlation  func- 
tion, especially  for  small  distances.   The  result  of  the  poor 
approximation  for  small  distances  is  most  pronounced  over  MUS ,  as 
seen  in  Fig.  3a.   The  effect  was  small  over  the  sparse  part  of  EC 
and  over  MA. 

(ii)   Negative  Squared  Exponential  (NSE) 

The  NSE  has  been  recognized  as  inadequate  for  modeling  error 
covariances  for  some  time,  and  the  results  obtained  here  confirm 
that.   The  assumed  form  of  the  function  was 


(2)   A  +  (l-A)e"(s/b)2  . 


This  function  is  positive  definite  as  an  isotropic  function  in 
two  dimensions  for  0<A<1. 
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The  initial  fit  was  not  obtained  by  least  squares,  but 
simply  by  attempting  to  fit  the  model  correlation  reasonably  wellj 
for  small  distances,  taking  A=0 .   As  shown  by  Fig.  4a,  the  fit  is 
reasonably  good  up  to  about  6  ,  and  quite  poor  at  greater 
distances.   The  expected  errors  are  similar  in  magnitude  to  the 
expected  errors  for  the  four  term  Bessel  function,  except  over 
MA,  where  the  errors  are  larger.   However,  since  the  errors  over 
MA  tend  to  be  large  anyway,  the  relative  effect  is  not  as  great 
as  one  might  expect. 

The  second  attempt  was  by  least  squares  for  the  parameters  A 
and  b.   Because  the  NSE  is  too  flat  near  the  origin,  this  process 
yielded  an  intercept  value  of  0.8060,  shifting  the  correlation 
function  shown  in  Fig.  5a  so  that  it  is  entirely  above  above  the 
model  correlation  curve.   This  results  in  even  poorer  performance 
over  MUS  and  EC  than  the  previous  model,  due  to  the  inaccurate 
representation  for  small  distances.   The  performance  over  MA  was 
better  than  the  above.   The  contour  plots  of  expected  error  are 
shown  in  Fig.  5. 

Due  to  the  poor  performance  (compared  to  the  above)  obtained 
by  adding  a  constant  to  the  basic  NSE  it  was  decided  to  attempt 
to  find  a  better  fit  by  trial  and  error.   No  claim  is  made  about 
any  optimality  for  this  function.   The  results  are  shown  in  Fig. 
6,  and  these  plots  along  with  the  results  in  Table  1  demonstrate 
that  it  is  probably  not  possible  to  obtain  good  results  overall 
with  a  function  from  the  NSE  family,  and  certainly  not  for  the 
present  model  correlation  function. 
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(iii)   Autoregressive ,  Order  Two  (SOAR) 

The  SOAR  model  has  been  suggested  as  appropriate  by  Thiebaux 

(1985)  and  this  is  supported  by  simulations  due  to  Balgovind, 
et.al.  (1983).   The  model  is  also  the  current  favorite  for  incor- 
poration into  the  U.  S.  Navy  models.   The  formula  given  here 

includes  a  constant  term  which  is  not  part  of  the  SOAR  model,  but 
which  has  been  noted  to  improve  performance  considerably 

(Thiebaux,  et.al.,  1985),  and  those  results  are  confirmed  here. 
The  SOAR  function  with  additive  constant  is 

(3)   A  +  (l-A)[cos(as)  +  (b/a) sin(as ) ]e"bs   . 

This  function  is  positive  definite  (in  two  dimensions)  whenever 
a<b,  and  0 <A< 1 .   In  all  cases  investigated  here,  and  as  has  been 
reported  elsewhere,  (e.g.,  Thiebaux,  et.al.,  1985),  the  parameter 
a  tends  to  be  essentially  zero.   In  this  case  the  function 
reduces  to 

(3a)   A  +  (1-A)[1  +  bs]e"bs   . 

The  initial  attempt  was  a  least  squares  fit  to  the  data  with 
A=0 .   The  intercept  obtained  was  0.7977,  with  the  resulting 
correlation  curve  then  being  considerably  above  the  model  correl- 
ation curve  between  0   and  15  ,  as  is  shown  in  Fig.  7a.   The 
performance  was  only  slightly  better  than  with  any  of  the  prev- 
ious correlation  functions.   It  was  then  decided  to  attempt  a 
least  squares  fit  with  the  intercept  constrained  to  be  0.8592, 
the  same  as  obtained  for  the  model  correlation  function,  but 
again  with  A=0 .   The  results  of  this  calculation  and  the 
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resulting  expected  error  contours  are  shown  in  Fig  8.   Table  1 
shows  marginal  improvement  for  all  three  grid/observation 
patterns.    A  third  attempt  included  A  in  the  least  squares  fit, 
with  no  constraint.   This  resulted  in  a  much  closer  match  to  the 
model  correlation  function,  although  the  intercept  of  0.8441 
moved  the  assumed  correlation  curve  above  the  model  curve  for 
much  of  the  interval.   The  results  are  shown  in  Fig.  9,  and  Table 
1  shows  considerable  improvement  over  all  previous  results,  the 
most  improvement  being  for  MUS ,  and  the  least  for  MA. 

(iv)   Autoregressive,  Third  Order  (TOAR) 

The  use  of  the  TOAR  model  has  been  investigated  by  Thiebaux, 
et.al.  (1985),  including  an  additive  constant.   The  formula  is 

(4)   A  +  (l-A)[(acos(as)  +  bsin( as ) )e"bs  +  ce"CS] 

where  the  coefficients  a,  b,  and  c  are  functions  of  a,  b,  and  c 
(see  Appendix  1  for  the  formulas).   It  is  unknown  what  restric- 
tions (beyond  0<A< 1 )  on  the  parameters  are  required  to  ensure  the 
function  is  positive  definite  as  an  isotropic  function  in  two 
dimensions . 

The  data  was  fit  by  least  squares  with  the  TOAR  function 
(4).   The  intercept  was  0.8651,  which  resulted  in  the  curve  being 
slightly  below  the  model  correlation  curve  over  most  of  the 
range,  although  the  fit  was  quite  close,  better  than  any  of  the 
previously  discussed  functions.   The  results  are  given  in  Fig.  10 
and  Table  1  and  show  very  close  agreement  with  the  optimum  possi- 
ble for  all  three  of  the  grid/observation  sets. 
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(v)   Damped  Cosine 

The  damped  cosine  function  has  been  suggested  by  Thiebaux 
(1976)  and  Seaman  and  Hutchinson  (1985).   The  formula  is 

(5)   [A  +  (l-A)cos(as)]/[l  +  (bs)2]c  . 

It  is  unknown  whether  the  function  is  positive  definite  as  an 
isotropic  function  in  two  dimensions,  but  the  evidence  in  Section 
3.4  (for  c=.5),  while  inconclusive,  seems  to  indicate  it  is  not. 
In  practice,  of  course,  the  function  may  be  positive  definite 
when  the  observation  points  are  restricted  to  certain  regions. 

The  data  was  fit  with  the  function  (5),  under  the  restric- 
tion c=.5.   The  intercept  was  0.8565,  which  resulted  in  a  very 
slight  raising  of  the  curve  relative  to  the  model  correlation 
function.   The  resulting  fit  is  excellent  for  small  distances  and 
very  good  over  the  entire  range,  as  is  seen  in  Fig.  11a.   Table  1 
shows  that  the  best  results  for  any  of  the  assumed  correlation 
functions  are  achieved  here. 

(vi)   Variations 

The  expected  error  computations  for  a  number  of  variations 
of  the  above  functions  were  also  performed.   The  principal 
variation  was  to  fit  the  data  only  over  the  first  half  of  the 
interval,  (0°,15°).   The  effect  of  this  was  to  generally  (though 
not  always)  increase  the  erms  errors  over  MUS  and  EC,  while  not 
affecting  the  results  over  MA.   In  the  damped  cosine,  the  expo- 
nent c  was  chosen  by  least  squares,  along  with  the  other 
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parameters,  and  resulted  in  a  slightly  better  fit  to  the  correla 
tion  function,  especially  at  larger  distances.  However,  the 
coefficient  A  was  slightly  greater  than  1.  Whatever  the  positiv 
definiteness  properties  of  the  function,  having  A>1  will  certain 
ly  make  it  non-positive  definite.  Although  the  graphical  result 
are  not  shown,  the  coefficients  and  erms  errors  are  given  in 
Table  1  for  the  additional  assumed  correlation  functions. 

4.5   Conclusions 

The  principal  conclusion  to  be  drawn  is  that  the  correlatio: 
family  used  in  practical  analysis  should  embody  a  sufficient 
number  of  parameters  to  fit  the  forecast  error  statistics  reason 
ably  well.   Further,  it  is  most  important  that  the  data  be  fit 
accurately  for  small  distances.   In  order  to  ensure  a  better  fit 
for  small  distances,  it  may  be  worthwhile  to  enforce  the  inter- 
cept of  the  correlation  for  the  first-guess  plus  observation 
errors  _if  the  ratio  of  standard  deviations  of  the  two  errors  is 
known  accurately .   The  effect  of  scaling  to  obtain  the  correla- 
tion function,  and  the  apparent  shift  up  or  down  can  possibly  be 
compensated  for  by  artificially  varying  the  ratio  of  first-guess 
to  observation  errors,  as  well,  although  it  seems  more  desirable 
to  enforce  this  ratio  in  the  correlation  function  fitting  process 

As  noted  above,  clearly  the  most  important  region  for  the 
fit  to  the  correlation  function  to  be  accurate  is  for  small 
distances.   Over  the  sparsely  observed  region,  MA,  and  to  a 
lesser  extent  over  the  EC  region,  the  overall  erms  errors  were 
only  slightly  affected  by  the  assumed  correlation  function.   In 
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the  case  of  the  MA  region  it  is  noted  that  the  error  contours  are 
relatively  unaffected  except  near  the  observation;.   Since  the 
errors  in  the  remote  part  of  the  region  dominate  the  overall 
error,  the  choice  of  assumed  correlation  function  has  relatively 
small  influence.   On  the  other  hand,  over  the  densely  observed 
region,  an  accurate  fit  at  small  distances  was  most  important. 
The  NSE  correlation  function,  while  not  performing  well,  illus- 
trates the  above  nicely,  as  shown  in  Figs  4  and  6.   Even  though 
the  fit  shown  in  Fig.  4a  is  poor  for  distances  of  more  than  6  , 
compared  to  the  fit  in  Fig.  6a,  the  erms  errors  over  MUS  and  EC 
are  smaller  due  to  the  more  accurate  fit  for  small  distances  by 
the  former  function.   Of  course  the  erms  errors  over  MA  are 
poorer  due  to  the  very  bad  fit  at  large  distances  in  Fig.  4a. 

There  appear  to  be  several  good  candidates  for  use  as  two 
dimensional  isotropic  correlation  functions,  including  SOAR, 
TOAR,  and  damped  cosine,  given  by  Eqs .  (3),  (4),  and  (5), 
respectively.   While  the  fitting  power  for  the  latter  two  are 
greater  (there  are  a  greater  number  of  parameters  for  those  two), 
the  choice  of  SOAR  seems  reasonable  and  adequate  for  a  number  of 
reasons'.   (1)   The  SOAR  (with  the  additive  constant)  embodies  a 
sufficient  number  of  parameters  to  allow  oscillation  and  decay 
with  distance.   (2)   The  SOAR  has  some  credibility  as  the  spatial 
correlation  function  of  an  innovation  process,  but  in  one  dimen- 
sion rather  than  two,  and  further  the  only  evidence  that  it  might 
be  near  the  right  one  is  that  cited  previously  in  Balgovind, 
et.al.  (1983).   (3)   The  SOAR  was  demonstrated  here  to  be  posi- 
tive definite  as  an  isotropic  function  in  two  dimensions,  under 
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what  seems  in  the  practical  case  to  be  a  mild  restriction  on  the 
parameters.   (4)   While  the  TOAR  is  also  the  spatial  correlation 
function  (again  in  one  dimension)  for  an  innovation  process, 
based  on  this  limited  study  it  does  not  appear  to  be  signifi- 
cantly better  than  SOAR.   (4)   The  positive  definiteness  proper- 
ties of  the  TOAR  are  not  known,  although  it  seems  reasonable  to 
speculate  that  it  is  positive  definite  as  an  isotropic  function 
in  two  dimensions  under  some  restrictions  on  the  parameters.   (5) 
While  the  fitting  ability  of  the  damped  cosine  seems  to  be  at 
least  as  good  as  the  TOAR,  and  while  it  is  positive  definite  in 
one  dimension,  evidence  indicates  it  may  not  be  positive  definite 
as  an  isotropic  function  in  two  dimensions,  regardless  of  parame 
ter  restrictions.   The  availability  of  other  acceptable  alterna- 
tives seems  to  make  it  prudent  to  preclude  the  use  of  the  damped 
cosine  in  practical  situations. 

Finally,  it  is  pointed  out  that  all  of  the  functions  except 
the  four  term  Bessel  function  and  the  NSE  perform  very  well. 
Table  1  shows,  for  example,  that  the  SOAR  is  only  a  little  more 
than  1%  of  the  standard  deviation  of  the  first-guess  error  poorei 
than  optimal  over  MUS  and  EC,  and  less  than  .1%  poorer  over  MA. 
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Parameter  Values  and  erms  Errors 
Model  Correlation  is  Six  Term  Bessel 


Assumed 
Correlation 
Function 


Parameters 
a, b , c     A,  A 


MUS 
0.2667 


erms 
EC 


MA 
0.7483 


6  T  Bessel 

0 
0 
0 
0 
0 
0 
0 

.2474 
3335 

.  1844 
1031 

.0362 
0554 

.0400 

4  T  Bessel 

0 
0 
0 
0 
0 

2811 
3090 
2213 
0930 
0956 

NSE 

10 

.0 

0 

0 

NSE 

14 

88 

0 

3200 

NSE 

10 

.0 

0 

2500 

SOAR 

0 

0 

0 
0 

0 
1215 

SOAR 

0 
0 

0 
.  1374 

0 

0 

SOAR 

0 
0 

.0 
2055 

0 

2722 

TOAR 

0 
0 
0 

4732 
3828 
0914 

0 

1974 

Dmpd  Cos 

0 
0 
0 

.4749 
1367 
5000 

0 

9592 

NSE. 
NSE*. 

15 

0 

0. 

0 

12 

31 

0. 

3205 

SOAR 

0 

0 

0. 

3758 

TOAR* 

0 

2654 

0 

4468 

-5. 

9965 

0. 

1482 

Dmpd  Cos 

0 

0052 

1 

2236 

1. 

0027 

0. 

1507 

0 

5000 

Dmpd  Cos 

0 

0. 

7009 
2069 

1. 

0105 

* 
Dmpd  Cos 

0 

3753 

0. 

7987 

1. 

0147 

0. 

2350 

0. 

3317 

0.3752 


0.3046 


0.3047 
0. 3688 
0.3098 
0.3034 

0.2931 

0.2780 

0.2717 

0.2686 


0.3619 
0.3474 
0.2743 

0.2697 


0.2749 


0.2692 


0.2706 


0.4088    0.7503 


0.4184 
0.5282 
0.4158 
0.4022 

0.3968 

0.3859 

0.3794 

0.3779 


0.4414 
0.4299 
0. 3825 

0.3801 


0.3734 


0.3779 


0.7822 
0.7631 
0.7541 
0.7562 

0.7562 

0.7491 

0.7485 

0.7486 


0.7649 
0.7593 
0.7495 

0.7514 


0.7491 


0.7484 


0.3784    0.7485 


Table  1 

These  correlation  functions  were  obtained  by  least  squares 
fit  over  the  interval  (0°, 15°) 
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Figure  Captions 

Figure  1:  The  data  points  and  least  squares  fits  by  four  and  si> 
term  Bessel  functions. 

Figure  2:   (a)   Six  term  Bessel  correlation  function  (true  and 
assumed).   (b)   Expected  root-mean-square  error  contours  for  the 
grid  and  observation  point  set  for  the  correlation  functions  show 
in  (a).   (c)   As  in  (b)  for  the  EC  grid  and  observation  point  set 
(d)   As  in  (b)  for  the  MA  grid  and  observation  point  set. 

Figure  3:  (a)  Six  term  Bessel  correlation  function  (true)  and  f 
term  Bessel  correlation  function  (assumed).  (b),  (c),  and  (d)  A 
in  corresponding  parts  of  Fig.  2  for  the  correlation  functions  sh 
in  (a). 

Figure  4:   (a)   Six  term  Bessel  correlation  function  (true)  and 
negative  squared  exponential  correlation  function  obtained  by  visi 
fit  for  small  distances  (assumed).   (b),  (c),  and  (d)   As  in  corr 
sponding  parts  of  Fig.  2  for  the  correlation  functions  shown  in  ( 

Figure  5:   (a)   Six  term  Bessel  correlation  function  (true)  and 
negative  squared  exponential  plus  constant  correlation  function 
obtained  by  least  squares  fit  (assumed).   (b),  (c),  and  (d)   As  i 
corresponding  parts  of  Fig.  2  for  the  correlation  functions  shown  I 
(a). 

Figure  6:   (a)   Six  term  Bessel  correlation  function  (true)  and 
negative  squared  exponential  plus  constant  correlation  function 
obtained  by  visual  fit  (assumed).   (b),  (c),  and  (d)   As  in  correi 
sponding  parts  of  Fig.  2  for  the  correlation  functions  shown  in  («< 

Figure  7:   (a)   Six  term  Bessel  correlation  function  (true)  and 
second  order  autoregressive  correlation  function  obtained  by  leastj 
squares  (assumed).   (b),  (c),  and  (d)   As  in  corresponding  parts  c 
Fig.  2  for  the  correlation  functions  shown  in  (a). 

Figure  8:   (a)   Six  term  Bessel  correlation  function  (true)  and 
second  order  autoregressive  correlation  function  obtained  by  least 
squares  with  intercept  constaint  (assumed).   (b),  (c),  and  (d)   As 
in  corresponding  parts  of  Fig.  2  for  the  correlation  functions  shci 
in  (a). 

Figure  9:   (a)   Six  term  Bessel  correlation  function  (true)  and 
second  order  autoregressive  plus  constant  correlation  function 
(assumed).   (b),  (c),  and  (d)   As  in  corresponding  parts  of  Fig. 
for  the  correlation  functions  shown  in  (a). 

Figure  10:   (a)   Six  term  Bessel  correlation  function  (true)  and 
third  order  autoregressive  plus  constant  correlation  function 
(assumed).   (b),  (c),  and  (d)   As  in  corresponding  parts  of  Fig. 
for  the  correlation  functions  shown  in  (a). 

Figure  11:   (a)   Six  term  Bessel  correlation  function  (true)  and 
damped  cosine  with  exponent  1/2  correlation  function  (assumed), 
(b),  (c),  and  (d)   As  in  corresponding  parts  of  Fig.  2  for  the 
correlation  functions  shown  in  (a). 
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Appendix  1 

This  appendix  lists  many  of  "the  functions  proposed  for  use 
as  correlation  functions  in  SI  during  the  past  15  or  20  years. 
Many  have  been  mentioned  by  several  authors,  and  there  does  not 
seem  to  be  any  standard  notation  for  parameters.   This  listing 
will  attempt  to  be  somewhat  consistent  in  the  meaning  of  various 
parameter  symbols  used  in  defining  the  functions.   Of  course, 
this  kind  of  goal  cannot  be  completely  successful ,  and  it  is  not 
in  this  case.   In  addition,  the  attempt  results  in  possible 
confusion  in  certain  cases  where  the  role  of  symbols  used  in  the 
cited  literature  is  interchanged  with  that  adopted  here.   None- 
theless, it  seemed  to  be  useful  to  try  to  be  consistent,  although 
reasonable  persons  can  certainly  disagree  on  what  that  means. 

The  symbols  used  in  this  compendium  have  meanings  as 

described  here. 

a      -  oscillation  parameter 

b      -  decay  parameter  (primary  or  one  in  oscillatory  term) 

c,d,e  -  other  nonlinear  parameters 

A,  A.  -  linear  coefficients 

s      -  distance,  assumed  positive  to  ensure  symmetry 

Finally,  while  the  list  is  extensive,  no  claim  is  made  that 
it  is  exhaustive  or  complete.   Apologies  are  made  in  advance  to 
those  authors  whose  suggestion  of  any  of  one  of  the  below  pre- 
dated all  the  references. 
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Function  References 


e 


(s/b)2  Bergman  (1979) 

Buell  (1972) 


Gandin  (1963) 

Lorenc  (1981) 

McPherson  (1982) 

Seaman  &.  Hutchinson  (1985) 

Schlatter,  et.al.  (1976) 

Thiebaux  (1976) 


2.  (l-A)e  (s/b)2  +  A  Thiebaux,  et.al.  (1985) 

3.  (1  +Z  Aisi)e"(s/b)2  Buell  (1972) 

i 

4.  cos(as)e~bs  Creutin  &  Obled  (1982) 

5.  [A  +  (l-A)cos(as)]e"bs  Seaman  &  Hutchinson  (1985) 

6.  (1  +  bs)e~bs  Buell  (1972) 

Seaman  &.  Hutchinson  (1985) 
Thiebaux,  et.al.  (1985) 
Yudin  (1961) 

7.  [cos(as)  +  (b/a)sin(as)]e"bs        Thiebaux  (1985) 

8.  A+(l-A)[cos(as)+(b/a)sin(as)]e"bs   Thiebaux,  et.al.  (1985) 

9.  [acos(as)+bsin(as)]e"bs+ce"cs       Thiebaux,  et.al.  (1985) 

*  (a,b,c )  =  G(a,b,c) 

10.  A+(l-A)[acos(as)+bsin(as)]e~bs+ce~cs]  Thiebaux,  et.al.  (1985) 

*(a,b,c)=G(a,b,c) 

11.  [A  +  (l-A)e~bs°cos(as)  Alaka  &  Alvander  (1972) 

Steinitz,  et.al.  (1971) 

12.  [A  +  (l-A)cos(as)](l  +  (bs)2)~c     Seaman  &  Hutchinson  (1985) 

Thiebaux  (1976) 

13.  [A  +  (l-A)cos(as)]e~bs°  Thiebaux  (1975) 

14.  [A  +  (l-A)J0(as)]e"bs°  Thiebaux  (1975) 

0 
16.  La. 


15.   [A  +  (l+a)Jn(as](l  +  (bs)2)  c      Thiebaux  (1975) 


•J0(k.s/R)  +  Afl  Rutherford  (1972) 

i  R  -  radius     Lonnberg  (1982) 

k.  -  ith  root  of  JQ 


63 


18.  (1  +  (bs)2)~c  Buell  (1972) 

19.  (1  +  cs  +  a(cs)2)e"bs  Buell  (1972) 

20.  sech(bs)  Buell  (1972) 

21.  sin(as)/(as)  Buell  (1972) 

22.  K1(bs)/(bs)  Buell  (1972) 

23.  21/3r(2/3)(bs)2/3K2/3(bs)  Buell  (1972) 

24.  bQ-^8  ~  ce-bs  Buell  (1972) 

b  -  c 


*  a  =  (3b2-a2-c2)ac/D 
b  =  (b2-3a2-c2)bc/D 
c  =  -2(b2+a2)ab/D 
where  D  =  ( 3b2-a2-c2ac-2 (b2+a2 )ab 
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Appendix  2 

This  appendix  contains  the  results  of  some  experiments  in 
generating  simulated  raw  statistics  for  first-guess  plus  observa- 
tion errors.   In  many  respects  the  data  appears  very  similar  to 
that  contained  in  the  literature,  e.g.  Julian  and  Thiebaux 
(1975),  Schlatter  (1975),  Thiebaux  (1976),  and  Thiebaux,  et.al. 
(1985).   However,  some  interesting  observations  can  be  made  when 
the  simulations  involve  different  sets  of  realizations,  and 
especially  when  different  numbers  of  realizations  are  involved. 

First,  it  was  desired  to  continue  the  use  of  the  six  term 
Bessel  function  as  the  model  first-guess  correlation  function  and 
to  use  the  grid/observation  set  MUS  as  the  basis  for  the 
simulation.   This  presented  some  difficulty,  as  the  generation  of 
first-guess  errors  at  the  grid  points  (9x13=117  of  them)  with  the 
specified  spatial  error  covariance  function  requires  factoring 
the  desired  spatial  covariance  matrix  into  the  form   A=LL  .   This 
proved  to  be  impossible  to  do  because  the  matrix  was  very  ill 
conditioned.   Just  a  little  "tweaking"  of  the  matrix  can  result 
in  a  relatively  well  conditioned  one.   This  process  is  simply 
that  of  adding  a  "small"  constant  to  the  diagonal  term,  which 
shifts  the  eigenvalues  away  from  zero  by  that  amount,  hence 
quickly  alleviating  the  ill  conditioning.   However,  this  also 
results  in  having  a  first-guess  error  correlation  which  has  a 
Kronecker  delta  function  at  the  origin.   The  result  is  that  when 
the  first-guess  error  is  interpolated  to  the  observation  points, 
it  spreads  this  additional  term  throughout  the  region.   The 
effect  of  this  perturbation  must  be  assessed  before  it  can  be 
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concluded  that  the  subsequent  results  are  valid.  Therefore,  we 
first  consider  the  process,  and  the  effect  of  adding  a  constant 
to  the  diagonal  of  the  covariance  matrix. 

Each  realization  of  the  process  consists  of  the  following 
steps,  assuming  the  spatial  correlation  matrix  has  been  previous- 
ly factored  to  supply  as  input  to  the  random  number  generator  (I 
used  GGNSM  from  the  IMSL  library).   Normally  distributed  first- 
guess  errors  with  the  specified  spatial  correlation  and  standard 
deviation  are  generated.   Independent,  normally  distributed 
errors  with  specified  standard  deviation  are  generated  for  the 
observation  points.   The  first-guess  error  function  is  then 
interpolated  to  the  observation  points,  and  the  first-guess  minus 
observation  error  computed  for  the  observation  locations.   Then 
the  products  of  these  errors  are  computed  for  each  possible  pair 
of  observation  locations. 

This  process  is  then  repeated  for  a  given  number  of  realiza- 
tions, and  the  average  value  of  the  product  for  each  pair  of 
stations  computed.   This  results  in  the  matrix  (of  order  equal  to 
the  number  of  observation  locations)  of  empirical  covariances 
over  the  set  of  realizations.   Only  the  upper  triangular  part  is 
computed.   The  empirical  standard  deviation  for  each  location  is 
the  square  root  of  the  corresponding  diagonal  term  of  the  matrix, 
and  the  correlation  matrix  is  obtained  by  dividing  each  row  and 
column  by  the  corresponding  standard  deviations.   The  entries  in 
this  matrix  are  the  data  plotted  in  parts  (a)  and  (c)  of  Figs. 
A1-A4  as  a  function  of  distance  between  observation  locations. 
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This  data  is  then  averaged  over  . 5   intervals  by  simply  taking  an 
arithmetic  average  of  the  values  in  each  interval  [  5(n-l),.5n) 
for  n=l, . .  .   This  is  the  data,  plotted  at  the  midpoints  of  the 
given  intervals,  in  parts  (b)  and  (d)  of  Figs.  A1-A4. 

As  noted  above,  the  correlation  matrix  tends  to  be  ill 
conditioned  in  the  case  of  interest.   The  following  analysis 
shows  that  the  simple  expedient  of  adding  a  small  term  to  the 
diagonal  does  not  perturb  the  resulting  empirical  correlation 

matrix  too  much.   Suppose  that  the  desired  isotropic  spatial 

2 
covariance  function  for  the  first  guess  errors  is  a„C(s)  and  the 

2 
variance  of  the  observation  errors  is  o    .       The  covariance 

o 

function  to  be  simulated  is 

E[(I(ef)  -  eo)(p)(I(ef)  -  eQ)(q)]  =  cr|c(s)  +  a26(s)  , 
where  I(e~)  is  the  first-guess  error  function,  obtained  by 
interpolation  (the  errors  of  this  process  are  relatively  small 
for  this  case,  and  ignored)  to  the  observation  locations,  e   is 
the  observation  error  function,  6( s )  is  the  Kronecker  delta 
function  (zero  except  at  s=0,  where  it  is  1),  and  s  is  the 
distance  between  observation  points  p  and  q.   Division  by  the 
variance  yields  the  correlation,  which  for  s>0  is 

(Al)   R  C(s)  , 

where  R  =  a?/(af  +  a2 ) 
xf     o 

Now,  suppose  the  first-guess  error,  &~  ,  has  spatial  covari- 

2         2 
ance  a_C(s)  +  c  6(s).   This  first-guess  error  can  be  viewed  as 

the  sum  of  the  previous  errors  and  independent,  uncorrelated 

errors  at  the  grid  points.   Then,  the  interpolation  of  the  first- 
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guess  error  gives  the  sum  of  the  interpolation  functions  for  the 
previous  error  (the  interpolation  assumed  to  be  without  error) 
and  for  the  independent  errors  at  the  grid  points,  due  to  the 
additional  diagonal  term.   The  covariance  function  for  linear 
combinations  (obtained  by  interpolation  to  points  p  and  q)  of 
independent  random  variables,  r. ,  with  variance  c  ,  is 

E[(Lb.  (p)r.  )(£b.(q)r.)]  =  c2  Eb.  (p)b.  (q)  =  c2B(p,q)  . 
i  jJ     J         i 

This  function  is  seen  to  be  nonstationary ,  with  variance 

2 
c  B(p,p).   Fortunately,  the  function  B(p,q)  is  rather  well  be- 
haved in  our  case.   The  interpolation  is  by  piecewise  bicubics, 
for  which  the  coefficients  are  usually  less  than  one,  except  in 
the  boundary  rectangles,  where  they  may  be  slightly  larger  than 
one.   In  any  case,  the  value  of  the  B(p,q)  lies  between  zero  and 
1.4,  and  is  zero  except  when  the  points  p  and  q  are  close  enough 
together  to  use  some  of  the  same  grid  point  values  for  the 
interpolation.   This  only  happens  when  p  and  q  are  within  about 
15  ,  and  B(p,q)  will  be  quite  small  unless  they  are  in  the  same 
or  adjacent  rectangles  of  the  grid. 

Thus,  the  covariance  function  for  the  process  is 
E[I(Sf-eo)(p)(ef-eo)(q)]  =  a2C(s)  +  c2B(p,q)  +  a26(s)  . 
After  division  by  the  standard  deviations  at  points  p  and  q,  the 
resulting  correlation  function,  for  s>0,   is 

(A2)   [C(s)  +  c2B(p,q)]/(a|  +  c2B*(p,q)  +  a2)  , 
where  B  (p,q)  is  the  value  which  satisfies  the  equality 
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a2  +  c2B*(p,q)  +  a2  = 

[(a2  +  c2B(p,p)  +  o2q)(o*    +  c2B(q,q)  +  ,  2)]1/2  . 
The  value  of  B  (p,q)  corresponds  to  a  kind  of  geometric  average 
of  the  two  values  B(p,p)  and  B(q,q)  and  lies  between  them,  hence 
is  bounded  by  the  bound  on  B(p,q). 

The  function  in  Eq.  A2  must  be  compared  with  that  given  by 
Al .   After  some  manipulation,  it  is  possible  to  write  the 
expression  A2  as 

(A3)   RC(s)  -  Rc2[(RB*(p,q)C(s)-B(p,q))/(l+c2RB*(p,q)]  , 
where  Q      -    c  /o„ . 

Thus  the  correlation  functions  for  the  two  processes  differ 
by  the  second  term  of  Eq.  A3.   Since  B  (p,q)  and  B(p,q)  are  of 

reasonable  magnitude,  the  effect  is  seen  to  be  on  the  order  of 

2 

the  relative  amount  added  to  the  diagonal  term,  c  ,  which  in  the 

present  case  was  about  0.0001 

To  check  on  the  computational  validity  of  the  above-,  some 

2 
simulations  were  run  using  c   =  900,  a  large  value  for  the  other 

parameter  values,  given  in  the  next  paragraph,  and  one  that 
should  affect  the  results  significantly.   The  empirical  correla- 
tion function  should  be  diminished  by  about  a  factor  of  about 
0.43C(s)  for  "large"  distances,  and  less  for  small  distances. 
This  behavior  was  verified. 

The  above  analysis  allows  the  conclusion  that  adding  a 
"small"  amount  to  the  diagonal  of  the  correlation  matrix  will  not 
seriously  affect  the  empirical  correlation  data  obtained.   For 
purposes  of  the  experiments  reported  on  below,  c  =  0.1,  with 
tff  =  30,  and  ctq  =  10  for  R  =  .9  .   The  model  correlation  function 
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adopted  was  the  six  term  Bessel  function  of  section  4,  with 

2 
c   =0.01  added  to  the  diagonal  for  purposes  of  generating  the 

first-guess  error  correlation  matrix. 

Part  (a)  of  Figs.  A1-A4  show  the  raw  spatial  correlation 
data  for  first-guess  plus  observation  error  for  the  separation 
between  pairs  of  the  36  observation  points,  where  100  realiza- 
tions of  the  process  were  averaged.   Each  figure  corresponds  to 
realizations  with  different  initial  seed  numbers  for  the  random 
number  generator.   Part  (c)  of  the  same  figures  show  the  corre- 
sponding data  when  500  realizations  were  averaged,  400  more  being 
done  in  addition  to  those  of  part  (a).   Parts  (b)  and  (d)  give 
the  results  after  averaging  the  data  of  parts  (a)  and  (c), 
respectively,  over  .5°  intervals,  along  with  a  plot  of  the  model 
correlation  function. 

Several  comments  are  in  order.   The  data  of  parts  (a)  and 
(b)  (less  the  actual  correlation  plot)  are  reminiscent  of  that  in 
previously  mentioned  papers  in  that  the  data  is  qualitatively 
similar,  and  is  over  about  the  same  number  of  realizations,  i.e., 
averages  for  one  quarter  at  a  given  time  of  day  would  result  in 
slightly  fewer  realizations.   Given  the  much  less  scatter  in 
part  (c)  and  the  far  better  agreement  between  the  simulated  data 
and  the  model  correlation  function  in  part  (d),  it  is  easy  to 
conclude  that  it  is  possible  that  far  more  than  100  realizations 
are  necessary  to  obtain  a  good  estimate  of  the  correlation 
function.   Of  course,  there  are  many  other  variables  which  will 
tend  to  increase  the  scatter  in  any  real  data  analog  of  part  (a) 
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of  the  figures,  among  them  anisotropy  and  nonstationarity  of  the 
first-guess  error  correlation  and  first-guess  and  observation 
error  variances.   The  effect  of  this  on  the  real  data  analog  of 
part  (b)  is  not  easily  discernible,  and  in  any  case  will  depend 
on  the  way  the  observation  locations  are  distributed.   In  this 
context,  the  "out  of  place"  points  in  part  (d)  of  the  figures, 
near  distances  of  11   and  16   may  be  due  to  the  fact  that  far 
fewer  observation  point  pairs  happened  to  fall  into  that  distance 
interval  than  for  adjacent  intervals.   This  phenomenon  also  ac- 
counts for  the  increasing  scatter  at  greater  distances,  where 
there  are  few  observation  pairs  so  far  apart.   Although  the 
scatter  is  less,  the  same  phenomenon  occurs  at  small  distances, 
and  lack  of  data  at  small  distances  makes  accurate  estimation  of 
the  function  a  problem,  as  was  noted  in  section  4. 
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Figure  Captions 

Figure  Al :  (a)  Scatter-plot  of  empirical  correlation  between 
first-guess  minus  observed  values  as  a  function  of  distance  for  li) 
realizations  using  the  MUS  grid  and  observation  point  set.  (b) 
Interval  averaged  scatterplot  of  the  data  in  part  (a)  and  the  tru 
correlation  function  for  first-guess  minus  observed  errors.  (c)  \ 
in  (a),  but  for  an  additional  400  realizations.  (d)  As  in  (b),  li 
for  the  data  in  part  (c). 

Figure  A2:   Same  as  Fig.  1  for  a  different  set  of  realizations 

Figure  A3:   Same  as  Fig.  1  for  a  different  set  of  realizations 

Figure  A4:   Same  as  Fig.  1  for  a  different  set  of  realizations 
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